Filtering Relevant Text Passages Based on Lexical Cohesion

Authors

  • Matthias Priebe
  • Clemens H. Cap
Abstract

Monitoring news and blogs has become a promising application for globally operating groups that want to recognize topic developments in a fragmented topic landscape. News articles, especially long ones, may cover several topics or different aspects of the same topic. In Topic Detection and Tracking (TDT), it is hard to trace topic development in a stream of news or blog articles with respect to a particular information need, since articles often contain only a limited amount of relevant information. In this paper we address the problem of filtering relevant portions of text, commonly known as passage retrieval, by using linear text segmentation methods based on lexical cohesion. We present two strategies for passage retrieval and compare their performance with two cohesion-based approaches developed in the context of linear text segmentation: TextTiling (cf. [Hea97]) and TSF (cf. [KG09]).
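
As a rough illustration of the general idea, the sketch below splits a list of sentences into passages wherever the lexical overlap between neighbouring blocks drops, then keeps the passages that share terms with a query. It is not one of the two strategies proposed in the paper, nor a faithful implementation of TextTiling or TSF. The function names, the cosine measure over raw term counts, the fixed `threshold`, and the `block_size` parameter are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def gap_scores(sentences, block_size=1):
    """Score the cohesion at every gap between consecutive sentences.

    Each score compares the block of sentences just before the gap with
    the block just after it (hypothetical, simplified block comparison).
    """
    bags = [Counter(tokenize(s)) for s in sentences]
    scores = []
    for gap in range(1, len(bags)):
        left = sum(bags[max(0, gap - block_size):gap], Counter())
        right = sum(bags[gap:gap + block_size], Counter())
        scores.append(cosine(left, right))
    return scores


def segment(sentences, block_size=1, threshold=0.1):
    """Start a new passage wherever the gap score falls below threshold."""
    if not sentences:
        return []
    passages, current = [], [sentences[0]]
    for sentence, score in zip(sentences[1:], gap_scores(sentences, block_size)):
        if score < threshold:
            passages.append(current)
            current = []
        current.append(sentence)
    passages.append(current)
    return passages


def filter_passages(passages, query):
    """Keep only passages that share at least one token with the query."""
    query_terms = set(tokenize(query))
    return [
        p for p in passages
        if query_terms & {t for s in p for t in tokenize(s)}
    ]
```

Calling `filter_passages(segment(sentences), "interest rates")` would return only the passages that mention one of the query terms. A practical system would at least remove stop words and smooth the gap scores before choosing boundaries; both steps are left out here to keep the sketch short.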

Similar resources

On document relevance and lexical cohesion between query terms

Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms’ occurrences in a document is related to its relevance to the query. Lexica...

A Study of Document Relevance and Lexical Cohesion between Query Terms

Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms' occurrences in a document is related to its relevance to the query. Experim...

Biased LexRank: Passage retrieval using random walks with question-based priors

We present Biased LexRank, a method for semi-supervised passage retrieval in the context of question answering. We represent a text as a graph of passages linked based on their pairwise lexical similarity. We use traditional passage retrieval techniques to identify passages that are likely to be relevant to a user’s natural language question. We then perform a random walk on the lexical similar...

Trade-Off between Factors Influencing Quality of the Summary

Our summarization approach is based on the assumption that the quality of a summary is influenced by a set of factors that depend on the lexical and grammatical features of the text units selected and arranged while composing the summary. The system was developed taking into account six factors influencing the final quality: compliance with the genre "summary", relevance, focusing, compliance wi...

Lexical Cohesion Based Topic Modeling for Summarization

In this paper, we attack the problem of forming extracts for text summarization. Forming extracts involves selecting the most representative and significant sentences from the text. Our method takes advantage of the lexical cohesion structure in the text in order to evaluate significance of sentences. Lexical chains have been used in summarization research to analyze the lexical cohesion struct...

Publication date: 2010